Summary

Bis-(3',5') cyclic di-guanylate (c-di-GMP) is a key bac- 
terial second messenger that is implicated in the regu- 
lation of many crucial processes that include biofilm 
formation, motility and virulence. Cellular levels of 
c-di-GMP are controlled through synthesis by GGDEF 
domain diguanylate cyclases and degradation by two 
classes of phosphodiesterase with EAL or HD-GYP 
domains. Here, we have determined the structure of an 
enzymatically active HD-GYP domain protein from 
Persephonella marina (PmGH) alone and in complex with
substrate (c-di-GMP) and final reaction product (GMP).
The structures reveal a novel trinuclear iron binding
site, which is implicated in catalysis, and identify
residues involved in recognition of c-di-GMP. This
structure completes the picture of all domains involved in
c-di-GMP metabolism and reveals that the HD-GYP 
family splits into two distinct subgroups containing 
bi- and trinuclear metal centres. 
Introduction

Bis-(3',5') cyclic di-guanylate (c-di-GMP) is a second mes- 
senger utilized by almost all eubacteria that acts to regulate 
a wide range of functions including developmental transi- 
tions, adhesion, biofilm formation, motility and the synthe- 
sis of virulence factors (Schirmer and Jenal, 2009; Boyd 
and O'Toole, 2012). C-di-GMP is synthesized from two
GTP molecules by GGDEF domain-containing diguanylate 
cyclases (DGCs) and degraded by phosphodiesterases 
(PDEs) with either an EAL or HD-GYP domain (Ryan et al., 
2006; Hengge, 2009; Schirmer and Jenal, 2009; Boyd and 
O'Toole, 2012). Three-dimensional structures have been
determined for GGDEF and EAL domains, and have 
afforded detailed insight into their roles in the turnover of 
c-di-GMP and regulatory interactions with other proteins 
(Hengge, 2009; Schirmer and Jenal, 2009; Boyd and 
O'Toole, 2012). In contrast, enzymatically active HD-GYP
domain proteins, such as the paradigm RpfG from the plant 
pathogen Xanthomonas campestris (Ryan et al., 2006),
have so far proved intractable to structure determination by 
X-ray diffraction. 

The structure of an unconventional, catalytically inactive
HD-GYP domain protein from Bdellovibrio bacteriovorus
(Bd1817) has, however, been determined (Lovering et al.,
2011). This work identified a binuclear iron centre and the
role of conserved residues within the HD-GYP family
(including the HD dyad) in metal binding. The HD domain
superfamily of enzymes, to which the HD-GYP family 
belongs, has been shown to catalyse phosphomonoester- 
ase and phosphodiesterase reactions depending on their 
catalytic metal centre being mono- or binuclear, respectively
(Aravind and Koonin, 1998; Galperin et al., 2001).
The determination of the structure of Bd1817 may thus 
afford some insight into metal binding by enzymatically 
active HD-GYP domains, but the protein lacks the con- 
served tyrosine of the GYP motif and has no c-di-GMP 
phosphodiesterase activity, precluding insights into the 
role of the other conserved residues. 

Here we describe the first crystal structure of an 
enzymatically active HD-GYP phosphodiesterase protein, 
PmGH from Persephonella marina EX-H1, a thermophilic 
marine member of the Aquificales. PmGH comprises an 
HD-GYP domain fused to a GAF domain. We have also
determined structures for this protein in complex with its 
substrate (c-di-GMP) and final reaction product (GMP). 
The structures reveal the mode of binding of the di- 
nucleotide and shed light on the catalytic mechanism. A 
remarkable feature of the structure of PmGH was the 
identification of a trinuclear Fe centre which is buried at the 
bottom of the cavity forming the c-di-GMP binding site. 
Adequate space is available for the substrate to bind 
dynamically and interact with the metal centre to sequen- 
tially hydrolyse c-di-GMP to GMP. Knowledge of the con- 
served residues involved in binding to the novel trinuclear 
Fe centre together with the analysis of amino acid 
sequence alignment of a large cohort of HD-GYP domain 
proteins suggests further classification of this family into 
two distinct subgroups containing either a bi- or trinuclear 
metal centre. 

Results 

Target selection and structure determination 

Initial attempts to crystallize the archetypal HD-GYP 
domain protein RpfG from Xanthomonas campestris 
(Ryan et al., 2006) failed. An extensive search of structural 
homologues with standalone HD-GYP domains and/or in 
combination with different sensor and ligand binding 
domains was then made; this extensive list was rational- 
ized through bioinformatics analysis using criteria known to 
increase the success rate of crystallization (Slabinski et al., 
2007). Of a cohort of 15 proteins, a GAF/HD-GYP
domain-containing protein from P. marina EX-H1 (PmGH) gave
crystal hits that were further optimized for structure solu- 
tion. PmGH was shown to have c-di-GMP phosphodiester- 
ase activity both in the crystalline form (see below) and in 
solution (Fig. 1). The structure was solved by the single 
wavelength anomalous diffraction (SAD) technique using 
the anomalous signal from the bound iron (Table 1). 

Protein architecture 

PmGH forms a dimer with each monomer consisting of 
an N-terminal GAF domain connected to a C-terminal 
HD-GYP domain by an approximately 42 residue-long 
helix (α5 in Fig. 2A). Assembly of the head-to-head dimer
relies exclusively on the GAF domain and the long α5 helix,
with the HD-GYP domain playing no role in the dimeric
interface (Fig. 2A). The overall topology of the GAF domain
is similar to that of other GAF domain structures (Ho et al.,
2000; Kanacher et al., 2002; Martinez et al., 2002),
consisting of a six-stranded antiparallel β-sheet
(β3-β2-β1-β6-β5-β4) sandwiched by a three-helix bundle
(α1, α2 and α5) on one side and two short helices (α3 and α4)
on the other.

The catalytic HD-GYP domain contains the charac- 
teristic HD domain superfamily 5-helix core formed by 



helices α6 to α10 that in turn provides the scaffold for
sequestering the tri-iron centre through eight conserved
protein side-chain ligands (Fig. 2B and Fig. 3). The signature
HD motif forms part of this octet (H221 and D222)
and is located on α7 at a kink close to the C-terminal end
of the helix. Another four conserved histidines provide
metal ligands: H189 located at the start of α6,
H250 from α8 and H276/H277 at the end of α9. D305 from
α10 completes the protein metal ligands of the tri-iron
centre. In addition to the HD domain 5-helix core, the
HD-GYP domain contains two extra C-terminal helices, a
short helix α11 and finally α12, which is significantly bent,
allowing it to pack against helices α6 and α10 (Fig. 2B).

The HD-GYP domain of PmGH in its entirety resembles 
an opened two-clawed chela. One of the chela's claws
comprises the loop (L7/8) connecting α7 and α8 as well
as the start of α8, whereas the other claw is entirely
formed by the loop connecting helices α10 and α11
(L10/11) (Fig. 2B). The loop region connecting helices α9 and
α10 (L9/10) contains the signature GYP motif and forms a
well-ordered structure made up of two orthogonally orientated
U-turns, as observed in the structure of an inactive
HD-GYP domain protein (Lovering et al., 2011). The
conserved Y285 of the GYP motif points towards the metal
binding centre, which is buried in the cavity formed by the
chela. The 'GYP' loop forms a barrier on one side of the
opened chela, whereas the other side is unobstructed. A
sequence-based conserved motif in HD-GYP proteins
mapping to loop L9/10 was previously identified as
HHExxDGxGYP (Ryan et al., 2006); however, the PmGH
structure reveals that L9/10 is composed of 19 residues,
suggesting an extension of the consensus sequence to
HHExxDGxGYPxxxxxxxI, which includes a conserved
isoleucine residue (I294 in PmGH) that stabilizes the
closure of the second U-turn by hydrophobic interactions
with G284 from the GYP motif.
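
As an illustration of how the original and extended L9/10 consensus sequences differ in practice, the following minimal sketch translates each consensus string into a regular expression and scans a sequence for it. The helper names and the two test fragments are hypothetical and were invented for this example; they are not real HD-GYP sequences, and the sketch treats every "x" as a single unconstrained residue.

```python
import re
from typing import Optional

# Consensus strings quoted in the text; "x" stands for any single residue.
ORIGINAL_MOTIF = "HHExxDGxGYP"
EXTENDED_MOTIF = "HHExxDGxGYPxxxxxxxI"


def consensus_to_regex(consensus: str) -> "re.Pattern[str]":
    """Translate a consensus string into a compiled regular expression."""
    pattern = "".join("." if ch == "x" else re.escape(ch) for ch in consensus)
    return re.compile(pattern)


def find_motif(sequence: str, consensus: str) -> Optional[int]:
    """Return the 0-based start of the first match, or None if absent."""
    match = consensus_to_regex(consensus).search(sequence)
    return None if match is None else match.start()


# Hypothetical fragments (NOT real protein sequences): identical except for
# the residue eight positions after the proline of the GYP motif.
with_ile = "MKT" + "HHEAADGSGYP" + "KGLTAEE" + "I" + "PLA"
without_ile = "MKT" + "HHEAADGSGYP" + "KGLTAEE" + "V" + "PLA"

for name, seq in (("with_ile", with_ile), ("without_ile", without_ile)):
    print(name,
          "original:", find_motif(seq, ORIGINAL_MOTIF),
          "extended:", find_motif(seq, EXTENDED_MOTIF))
```

The extended pattern is stricter by construction: a sequence lacking the isoleucine still matches the original 11-residue consensus but fails the 19-residue one.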

Trinuclear Fe binding site 

The HD-GYP domain of PmGH contains a trinuclear metal 
centre that is located at the bottom of the cavity formed by 
the two claws of the open chela (Fig. 2B; Fig. 3A and B). 
Anomalous diffraction difference maps showed all three
metal sites to be occupied by iron atoms, with anomalous
peak heights in excess of 40 σ separated by approximately
3.5 Å (Fig. 3C). The bond valence sum method (Brown and
Altermatt, 1985) was used to estimate the oxidation states
of the three Fe ions, assigning Fe(II) to the two peripheral
sites and Fe(III) to the middle site. Overexpression
of protein in bacteria grown in minimal media
supplemented with divalent metal ions such as manganese
resulted in mixed metal occupancies in the peripheral
metal sites, but the middle metal site was always
observed to be occupied by Fe (Fig. 3C and Fig. S3).
Fig. 1. PDE activity of purified HD-GYP domain PmGH and variants with alanine substitutions.

A. Representative HPLC traces showing standards (i), aliquots of reaction mixtures boiled at 0 min (ii) and after 60 min (iii) incubation with the 
PmGH protein. The identity of the product was confirmed by mass spectrometry. 

B. Effects of alanine substitutions in metal ligands (E185, H189, H221, D222, H250, H276, H277 and D305), a strongly conserved residue in 
the family (D308) and the two putative catalytic residues (D183 and K225) on cyclic di-GMP hydrolysis. 

C. Effects of alanine substitutions in residues in the GYP motif (G284, Y285 and P286), the conserved I294 position and other residues 
involved in substrate binding (R314 and K317) on cyclic di-GMP hydrolysis. 

D. Primary sequence alignment of HD-GYP domains of PmGH with some of the most well-characterized HD-GYP proteins, such as TM0186 
(Thermotoga maritima), RpfG (Xanthomonas campestris pv. campestris), Paerl -3 (Pseudomonas aeruginosa), BBur (Borrelia burgdorferi) and
Bd1817 (Bdellovibrio bacteriovorus). Metal ligands, catalytic residues, substrate ligands and GYP motif, based on the PmGH structure, are 
highlighted in cyan, green, yellow and orange respectively. 



These data support the oxidation state assignment and 
indicate that the middle Fe site of the trinuclear centre is 
specific for Fe(III), while the peripheral sites can
accommodate other metals under Fe-deficient conditions. The
tri-iron centre delineates the floor of the open chela with the
two peripheral metal sites being in close proximity to Y285
of the GYP motif on one side (the G-site) and H221 of the
HD motif on the other side (the H-site). The middle iron
binding site (M-site) is sandwiched between the G- and
H-sites with Fe-Fe distances of 3.40 Å and 3.67 Å,



respectively (Fig. 2B and Fig. 3). All three metals are 
octahedrally co-ordinated (Fig. 3A and B). In PmGH the
HD-GYP domain conserved residues E185, H189, H221,
D222, H250, H276, H277 and D305 provide the protein
ligands, with the metal co-ordination sphere completed by
succinate and imidazole bound from the crystallization
buffer as well as two solvent molecules which bridge the
G-M and M-H metal pairs (Fig. 3 and Fig. S1). Both pairs of
Fe sites, G-M and M-H, are triply bridged: the G-M pair by
the carboxylate groups of D222 and a succinate from the
                                PmGH Fe peak          PmGH native           PmGH c-di-GMP         PmGH GMP

Data collection
X-ray source                    Diamond I24           Diamond I02           Diamond I02           Diamond I04-1
Wavelength (Å)                  1.74                  0.95                  0.98                  0.92
Resolution range (Å) a          71-2.70 (2.8-2.70)    72-2.03 (2.1-2.03)    66-2.68 (2.75-2.68)   71-2.55 (2.62-2.55)
Space group                     I222                  I222                  I222                  I222
Unit cell parameters
  a, b, c (Å)                   69.7, 180.8, 232.2    69.8, 182.4, 232.5    70.2, 183.0, 233.5    70.1, 181.1, 231.5
  α, β, γ (°)                   90, 90, 90            90, 90, 90            90, 90, 90            90, 90, 90
No. of observations             102 579               417 588               189 599               254 040
No. of unique observations      38 405                95 384                42 290                48 066
Rmerge (%)                      12.2 (72.1)           4.1 (56.0)            7.7 (68)              8.6 (68)
Completeness (%) a              98.7 (97.3)           99.8 (98.0)           99.4 (99.6)           99.2 (98.7)
Mean I/σ(I)                     28.3 (5.3)            15.2 (2.3)            14.3 (1.9)            12.7 (2.2)
B Wilson                        -                     52.42                 69.14                 60.06

Refinement
Rwork/Rfree (%)                 -                     19/21                 18/23                 18/22
Rms deviations, bonds/angles    -                     0.011/1.36°           0.013/1.72°           0.010/1.42°
Average B factor (Å²)           -                     52.8                  45.2                  37.3
Ramachandran favoured           -                     96.8%                 96.1%                 96.4%

a. Values in parentheses refer to the high resolution shell. Anomalous pairs were kept separate during merging of all datasets.



crystallization buffer in a bidentate fashion and by a
monodentate bridging solvent molecule; whereas the M-H pair
is bidentately co-ordinated by the carboxylate groups of
E185 and a second succinate ion and by another
monodentate bridging solvent molecule (Fig. 3A and B).
Analysis of metal-ligand bond lengths for the bridging solvent
molecules indicates that each bridging ligand is a
hydroxide ion. This agrees with observations for other HD domains
containing diiron centres (Brown et al., 2006; Lovering
et al., 2011).
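
The bond valence sum estimate of the Fe oxidation states mentioned above can be sketched as follows. This is a minimal illustration, not the calculation performed for PmGH: it assumes the standard form V = Σ exp((R0 − R)/b) with b = 0.37 Å, uses commonly tabulated Fe-O reference lengths R0 (1.734 Å for Fe(II), 1.759 Å for Fe(III)) that should be checked against a current parameter table, ignores Fe-N bonds to the histidine ligands, and uses invented bond lengths rather than values from the deposited structures.

```python
import math
from typing import Dict, Sequence

# Assumed bond valence parameters for Fe-O bonds (Angstrom). Treat these as
# illustrative; Fe-N bonds would need their own R0 values.
B_PARAM = 0.37
R0_FE_O: Dict[int, float] = {2: 1.734, 3: 1.759}


def bond_valence_sum(bond_lengths: Sequence[float], r0: float,
                     b: float = B_PARAM) -> float:
    """Sum of individual bond valences exp((r0 - r) / b) over all bonds."""
    return sum(math.exp((r0 - r) / b) for r in bond_lengths)


def assign_oxidation_state(bond_lengths: Sequence[float]) -> int:
    """Pick the trial oxidation state whose BVS is closest to itself."""
    deviations = {
        state: abs(bond_valence_sum(bond_lengths, r0) - state)
        for state, r0 in R0_FE_O.items()
    }
    return min(deviations, key=deviations.get)


# Hypothetical octahedral sites (six Fe-O distances each, in Angstrom).
short_bonds = [2.00] * 6   # invented, shorter bonds
long_bonds = [2.14] * 6    # invented, longer bonds

for label, bonds in (("short", short_bonds), ("long", long_bonds)):
    state = assign_oxidation_state(bonds)
    bvs = bond_valence_sum(bonds, R0_FE_O[state])
    print(f"{label}: assigned Fe({state}), BVS = {bvs:.2f}")
```

Shorter average bond lengths give a larger valence sum, which is why a site with tighter co-ordination is assigned the higher oxidation state in this scheme.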

C-di-GMP and GMP binding 

Structures of complexes of PmGH with GMP and c-di-GMP 
were determined from crystal soaking experiments with 
both nucleotides. Crystal soaks with GMP and c-di-GMP 
gave identical structures of PmGH in complex with GMP 
showing that PmGH retained phosphodiesterase activity 
in the crystal (Fig. 4A). However, pre-soaking of PmGH 
protein crystals in the presence of 100 mM EDTA followed 
by c-di-GMP revealed well defined difference electron 
density for a bound c-di-GMP molecule at the active site, 
displacing the metal bound succinate ions found in the 
nucleotide free structure. The c-di-GMP was modelled 
unambiguously with the two guanines in a cis conformation 
so that the molecule presents a V-shaped conformation 
when bound to PmGH (Fig. 4B). This is in contrast to what 
is observed in EAL domain proteins where c-di-GMP is 
bound in a more extended conformation (Navarro et al.,
2009). Crystal packing presents a less solvent exposed 
metal binding site for one of the monomers of the PmGH 
dimer and binding of nucleotide is observed only to 



monomer B. The structure reveals that only the G-site Fe 
remains bound in this monomer of PmGH in these
EDTA-treated crystals, whereas the other less accessible
monomer subunit still has Fe bound at all three metal sites,
but with reduced occupancy (~50%) for the M and H sites,
and shows no difference density for bound nucleotide.

The bound c-di-GMP is buried within the large pocket 
formed primarily by the HD-GYP domain claws (65% of its 
accessible surface, 523 Å² buried), with one of the
phosphate groups pointing towards the tri-iron centre, while the
ribose and guanine bases are stacked against the claws 
of the chela (Fig. 4B and C). Superposition of the PmGH
monomer bound to c-di-GMP, which has only the G-site Fe
occupied, onto the tri-iron PmGH structure shows that the
protein structure is essentially unchanged on binding of
c-di-GMP. The exceptions are E185, which is disordered
owing to the loss of the M-H Fe metal pair, and the residues
of loop L10/11, which interact directly with the bound
c-di-GMP and have better defined density.
This analysis shows the bound c-di-GMP to interact with
the middle Fe(III) (M-site) through one of the non-bridging
phosphate oxygens of one hydrolysable phosphate group 
and to the hydroxyl group of Y285, the signature Y residue 
of the GYP motif, through a hydrogen bonding interaction 
with a non-bridging oxygen of the other hydrolysable phos- 
phate (Fig. 4B). One of the guanine bases, base-1, forms 
three hydrogen bonds to PmGH, two through the C6 
carbonyl oxygen of the guanine with the NH1 guanidinium 
group of R314 and the NZ amine nitrogen of K317 and a 
third through the guanine N7 atom and the NH1 guanidinium
group of R314 (Fig. 4B). Guanine base-2 on the
other hand interacts with PmGH only through the C2 amine 
Fig. 2. Structure of PmGH.

A. Structure of the PmGH homodimer. Molecule A of the dimer is 
shown in ribbon representation with the GAF domain in green, the 
long inter-domain dimerization helix α5 in maroon, the core HD
domain in cyan, the GYP motif-containing loop in red and the 
additional surface decorating α-helices which complete the
HD-GYP domain in yellow. Molecule B of the dimer is shown in 
ribbon representation in orange with a semi-transparent surface. 
The trinuclear iron centre is shown as orange spheres. 

B. Detailed view of the HD-GYP domain of PmGH in ribbon
representation. Colour codes as in (A) with the addition of the HD
and GYP motif residues shown in ball and stick. Labelling for
α-helices and turns is shown. The central iron has been
labelled as the middle site (M) and the two flanking metal sites as 
H and G, to reflect their proximity to the HD and GYP motifs 
respectively. 



group that makes a hydrogen bond to the main-chain 
carbonyl of K235. Hydrophobic interactions with Y44, A309 
and L310 complete the binding interactions of PmGH with 
c-di-GMP (Fig. S2). Comparison of GMP and c-di-GMP 
binding shows that the guanine base of the GMP molecule 
superposes with the guanine base-1 of c-di-GMP (Fig. 4D). 
In the case of the tri-iron PmGH-GMP complex, the GMP 
phosphate moiety bridges the M and H metal sites as 



opposed to the inferred sole interaction of one of the 
hydrolysable phosphate groups with the middle Fe M-site 
based on the crystal structure of the mononuclear Fe 
PmGH-c-di-GMP complex (Fig. 4A and D). 

Mutational analysis of the role of key residues 

Alanine substitutions of metal ligands in PmGH (E185A, 
H189A, H221A, D222A, H250A, H276A, H277A and 
D305A) essentially abolished the phosphodiesterase 
activity in all cases except for E185 and D305, where 




Fig. 3. The tri-iron metal centre of PmGH. Detailed views of the 
tri-iron centre showing the first co-ordination sphere for the G-M 
metal pair (A) and for the M-H metal pair (B) with the co-ordination 
of the H and G sites not shown in (A) and (B) respectively for 
clarity. Protein metal interactions are highlighted as black lines. Fe 
atoms are shown as orange spheres. Protein side-chain metal 
ligands are in stick mode, coloured by atom type with carbon in 
pelican, while the carbon atoms of the metal ligands from the 
crystallization buffer, 2 succinates SIN-1,2 and an imidazole ion 
(IMD), are in yellow. (C) Fe-specific difference DANO map (Than
et al., 2005) in green and Mn anomalous difference map in red,
both contoured at 0.043 e Å⁻³. The angle subtended by the tri-iron
centre is shown. Black dashed lines depict Fe-Fe distances, while
yellow dashed lines indicate bond distances for the pair of
μ-hydroxides. Bond distances are in Angstroms.
Fig. 4. Substrate binding by PmGH.

A. View of GMP shown in stick mode and coloured by atom type bound to PmGH. Bonding interactions are represented by dashed lines with 
distances in Angstroms. Difference electron density map for GMP is contoured at 2 σ.

B. View of cyclic di-GMP bound to a metal depleted subunit of PmGH. Electron density for cyclic di-GMP is from a bias-removed omit map 
contoured at 2 σ. Fe atoms occupying the middle (M) and HD (H) sites are shown as semi-transparent spheres as they are not present in the
subunit (due to chelation by EDTA), and are taken from superposition of the equivalent metals from the high resolution structure of PmGH. 

C. Surface representation of the PmGH HD-GYP domain monomer subunit showing the binding cavity for cyclic di-GMP, which is represented
in stick mode and coloured by atom type. 

D. Superposition of the structures of PmGH bound to cyclic di-GMP and GMP. Both nucleotides are shown in stick mode. 



although there is a marked reduction in their ability to 
hydrolyse c-di-GMP, hydrolysis of c-di-GMP is still detected 
(Fig. 1B). Mutations in the GYP motif and conserved
residues implicated in c-di-GMP recognition (G284A, Y285A,
P286A, I294A, R314A and K317A) did not, however, result
in a substantial decrease in catalytic activity (Fig. 1C).
Alanine mutation of other conserved residues near the
metal centre (D183, D308 and K225) had a similar impact
on activity to that of the metal ligand residues. D308 is located at
the C-terminus of helix α10, forming hydrogen bonds with
R192 at the N-terminus of α5 and the metal ligand H189,
highlighting a structural role in stabilizing the metal centre
and the HD fold, while K225 contributes to stabilization of
the tri-iron centre through hydrogen bonds with the metal
ligands E185 and D222 and also with D183 (Fig. 1B).

Discussion 

HD-GYP domain-containing proteins are a large family of 
the HD superfamily of metal-dependent phosphohydro- 
lases (Aravind and Koonin, 1998). The predicted role of 



the HD-GYP domain as a PDE active against c-di-GMP 
(Galperin et al., 2001) was first demonstrated for RpfG
from X. campestris (Ryan et al., 2006; Ryan, 2013) with
further examples characterized from other bacteria
such as Pseudomonas aeruginosa (Ryan et al., 2009;
Stelitano et al., 2013), Borrelia burgdorferi (Sultan et al.,
2011), Thermotoga maritima (Plate and Marletta, 2012)
and Vibrio cholerae (Miner et al., 2013). The crystal
structure of a HD-GYP protein from P. marina (PmGH) 
and its complex with c-di-GMP provides the first struc- 
ture of an active HD-GYP domain protein that reveals a 
trinuclear Fe centre. The mode of binding of the cyclic 
nucleotide differs significantly from that observed for the 
more extensively characterized EAL domain containing 
PDEs (Schirmer and Jenal, 2009) while the binding 
site provides adequate room to allow both hydrolysable 
phosphates to interact in turn with the metal centre to 
complete hydrolysis of the c-di-GMP to GMP. The mode 
of c-di-GMP binding to the tri-iron site extends the diver- 
sity of structure and function of bacterial c-di-GMP 
phosphodiesterases. 
Fig. 5. Comparison of the structures of the HD-GYP domain from PmGH and the unconventional HD-GYP domain of Bd1817.

A. PmGH and Bd1817 PDE domains are shown in ribbon representation and coloured cyan and olive green respectively. The PmGH metal 
centre is as for Fig. 1, while the Bd1817 binuclear Fe centre with a bridging hydroxide ion is shown as green spheres. The shift in orientation 
in helices α6 and α10 in PmGH when compared with Bd1817 which allows the protein to accommodate the trinuclear metal centre is
highlighted. 

B. Comparison of the HD-GYP domain nucleotide-binding pocket highlighting the opened and closed conformations observed for PmGH and 
Bd1817 in red and green respectively. Hydrogen bonds proposed to hold Bd1817 in this closed conformation are shown as black dashed
lines. The double headed black arrow depicts the distance in Angstroms between the L7/8 and L10/11 loops in PmGH for which the active site 
is in an open conformation. 

C. Secondary structure superposition using protein fragments ranging from first to last residue involved in metal co-ordination of PmGH and 
Bd1817 showing only the protein metal ligands. 



Structural and chemical relationships of the trinuclear 
Fe centre 

Although all HD domains share key design features, a
striking diversity of catalytic centres has now been
identified, containing no metal, mono- or binuclear metal
centres, and here a trinuclear metal binding site. In PmGH
the tri-iron site presents a structure which is distinct from
the more extensively studied oxo-centred equilateral
triangle trinuclear iron complexes, although the first protein
structure with such an oxo-centred iron cluster was only
determined relatively recently (Högbom and Nordlund,
2004). Here, the tri-iron centre of PmGH has striking
similarities to a series of linear tri-iron complexes that were
obtained while designing mimetics for studying primarily
carboxylate-bridged bimetallic centres in metalloproteins
(Rardin et al., 1992; Kitajima et al., 1993). However, in
PmGH the tri-iron centre has an extended V shape
(Fig. 3C). The G and M metal sites of the tri-iron centre
align closely with the diiron metal centre of Bd1817, an
inactive HD-GYP protein (Lovering et al., 2011) (Fig. 5A),
and with those of other HD domain proteins containing
binuclear Fe sites, such as myo-inositol oxygenase (Brown et al., 2006).
Comparison of the structure of PmGH with that of Bd1817 shows
that the overall fold of the HD-GYP domain is maintained but
reveals a slight re-orientation of the first and fifth α-helices
of the 5-helix HD core domain to accommodate the tri-iron
centre (Fig. 5A and C). PmGH presents an accessible and
far larger binding cavity than is the case for Bd1817, where



the active site is capped by a 'lid' formed by loops corre- 
sponding to L7/8 and L10/11 in PmGH (Fig. 5B). The 
Bd1817 interactions responsible for keeping the lid tightly 
fixed in place are unique to Bd1817 (Fig. 1D), suggesting
that these mutations have evolved to intrinsically fix the lid
in a closed conformation to prevent binding of nucleotide,
keeping the protein enzymatically inactive. Although
a significant number of enzymatically inactive EAL and
HD-GYP domains have been demonstrated to play a role in
signal transduction and regulation through c-di-GMP
binding and/or protein interaction (Navarro et al., 2009;
Ryan et al., 2009), the regulatory role of Bd1817
remains to be revealed (Lovering et al., 2011).

Our observation of the tri-iron centre in the HD-GYP 
domain of PmGH prompted us to perform a phylogenetic 
comparison of this domain with all relevant proteins in the 
National Center for Biotechnology Information (NCBI) 
database. This analysis showed a distinct separation of the 
HD-GYP domains into two evolutionary groups (Fig. 6). 
This bipartition is independent of the type of regulatory 
and/or sensory domain associated with the HD-GYP 
domain (Fig. 6). Analysis of the sequence alignments high- 
lighted that only seven out of the eight PmGH metal ligand 
residues were shared between these two groups. The one 
variable ligand corresponds to E185 in PmGH which pro- 
vides a bidentate carboxylate ligand which bridges the M 
and H metal sites. In the group containing all eight ligands, 
which includes PmGH and TM0186 from Thermotoga
maritima, a REC/HD-GYP protein that has been characterized
Fig. 6. Maximum parsimony analysis of various HD-GYP domain sequences. The evolutionary history was inferred using the maximum
parsimony (MP) method. Tree #1 out of 17 most parsimonious trees (length = 3871) is shown. The consistency index is 0.257040 (0.255115),
the retention index is 0.444680 (0.444680), and the composite index is 0.114300 (0.113445) for all sites and parsimony-informative sites (in
parentheses). The MP tree was obtained using the Close-Neighbor-Interchange algorithm (Suzuki et al., 2002) with search level 0 in which the
initial trees were obtained with the random addition of sequences (10 replicates). The analysis involved 122 amino acid sequences. All
positions containing gaps and missing data were eliminated. There were a total of 116 positions in the final dataset. Sequence labels are in
the following format: (1) organism name followed by a number if more than one sequence is present from the same organism; (2) a +/- sign
indicating Gram + or Gram -; (3) other domains present besides the HD-GYP domain are indicated, followed by a number in case of multiple
copies; (4) a 3-letter code for the residue triplet subfamily signature corresponding to positions 185-187 in PmGH. Extra domains are:
REC = CheY-homologous receiver domain; GAF = present in cGMP-specific phosphodiesterases, Adenyl cyclase, FhlA; PAS = present in
Periodic circadian protein, Ah receptor nuclear translocator protein, Single-minded protein; HAMP = present in Histidine kinases, Adenyl
cyclases, Methyl-accepting proteins and Phosphatases; HD = extra HD domain of unknown function missing the GYP motif; TPR = domain
containing the Tetratricopeptide Repeat region; GGDEF = diguanylate cyclase containing the GGDEF motif; SBP = bacterial extracellular
Solute-Binding Protein; DUF = Domain of Unknown Function. Evolutionary analyses were conducted in MEGA5 (Tamura et al., 2011).
as an active c-di-GMP phosphodiesterase (Plate and
Marletta, 2012), the equivalent of the metal ligand E185 is
invariably found to be conserved or replaced by an
aspartate. A threonine and a glycine consistently follow this
conserved metal carboxylate ligand (E/D), giving the
signature motif E/D-T-G for this subfamily. Conversely, the
other subfamily primarily presents a tyrosine or
phenylalanine (Y/F) in place of the conserved carboxylate ligand
E185. Despite the lack of a unique signature triplet as in the
E/D-T-G group (except in some rare cases), there do exist
some well-populated sequence clusters within this second
Y/F subfamily identifiable by motifs such as Y-T-Y, Y/I-L-L
or F-T-F. The Y/F substitution of the carboxylate metal ligand
that in the PmGH structure bidentately bridges the M and H
metal sites would be expected to impact on the formation or
stability of the tri-iron centre. Thus, the Y/F subfamily is
more likely to contain a binuclear metal centre, in contrast
to the E/D-T-G subgroup, which is predicted to have a
trinuclear metal centre like PmGH. The boundary between
these two subfamilies is not entirely clear-cut, however. For
example, RpfG from X. campestris, despite phylogeneti- 
cally clustering within the E/D-T-G subgroup, aligns a 
glycine in place of the E/D residue, as well as a phenyla- 
lanine replacing the H-site metal ligand H189 of PmGH 
(Fig. 1D). The loss of two out of the eight metal ligands 
utilized by PmGH to bind the trinuclear Fe centre suggests 
that RpfG is more likely to possess a binuclear metal ion 
centre. This phylogenetic analysis, while suggesting sub- 
stantial diversity within the HD-GYP family of signalling 
proteins, at the same time identifies a potential quasi-equal 
distribution of putative bi- and trinuclear metal centres. 
Additional crystal structures of representative HD-GYP 
domain proteins will be required to clearly understand the 
structural evolution of this family of PDEs and the require- 
ment for a bi- or trinuclear metal centre for catalysis. 
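
The subfamily assignment described above can be expressed as a simple rule on the residue triplet aligned to positions 185-187 of PmGH. The sketch below is only an illustration of that rule: the function name and return labels are hypothetical, the triplet is assumed to have been extracted already from a sequence alignment, and the predicted metal centre is the hypothesis advanced in the text rather than an experimentally confirmed property.

```python
def classify_hd_gyp_triplet(triplet: str) -> str:
    """Classify an HD-GYP domain from the triplet aligned to PmGH 185-187.

    Returns a label for the E/D-T-G subfamily (predicted trinuclear),
    the Y/F subfamily (predicted binuclear), or "unassigned" when the
    triplet fits neither rule.
    """
    residues = triplet.strip().upper()
    if len(residues) != 3 or not residues.isalpha():
        raise ValueError("expected a three-letter residue triplet")
    # Carboxylate ligand (E or D) followed by threonine and glycine.
    if residues[0] in "ED" and residues[1:] == "TG":
        return "E/D-T-G (predicted trinuclear)"
    # Tyrosine or phenylalanine in place of the carboxylate ligand.
    if residues[0] in "YF":
        return "Y/F (predicted binuclear)"
    return "unassigned"


# Triplets mentioned in the text, plus one with glycine at the first
# position to show a case that the simple rule leaves unassigned.
for example in ("ETG", "DTG", "YTY", "FTF", "GTG"):
    print(example, "->", classify_hd_gyp_triplet(example))
```

A triplet with glycine at the first position, as the text reports for RpfG, falls outside both rules, which mirrors the statement that the boundary between the subfamilies is not clear-cut.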

Structural insights into the phosphodiesterase activity of 
the HD-GYP domain 

The structure of the PmGH-c-di-GMP complex predicts 
that the M-site Fe(lll) directly interacts with a non-bridging 
oxygen of one of the scissile phosphate diesters of the 
c-di-GMP substrate to provide a strong Lewis acid catalyst, 
while the metal-activated bridging hydroxide ion of the M-H 
Fe pair is the likely nucleophile for the hydrolysis of the 
scissile bond as similarly proposed for other
metallophosphatases (Williams et al., 1999). However, this
structure does not provide a definitive answer to how the O3'
leaving group is protonated. The closest conserved
residues to the O3' are D183, E185 and K225, with the
carboxylate group of E185 situated less than 4 Å away; but
the principal role of E185 is expected to be in metal binding.
Moreover, the E185A mutant is still capable of hydrolysing
c-di-GMP, although significantly less effectively than the
native protein, while alanine mutation of either D183 or
K225 abolishes enzyme activity (Fig. 1A). In the c-di-GMP-
PmGH complex, the D183/K225 pair is over 5 Å away from
the O3' of c-di-GMP. As the structure of the c-di-GMP-
PmGH complex only contains the G-site Fe (due to EDTA
treatment of crystals), binding of the c-di-GMP with an 
intact trinuclear metal centre is expected to influence the 
mode of binding of the nucleotide which could place D183 
appropriately to act as a general acid for protonation of the 
O3' leaving group with K225 stabilizing the unprotonated
state of D183. Protonation of O3' by bound water has been
proposed for EAL PDEs as well as for the HD domain
phosphohydrolase YfbR (Zimmerman et al., 2008;
Barends et al., 2009; Tchigvintsev et al., 2010) but in the
structures presented here no suitably positioned water is 
observed bound. 

The complex of c-di-GMP with PmGH shows the nucleotide
to bind with the guanine bases in a cis conformation,
which differs from EAL domain proteins, in which c-di-GMP
adopts a more extended conformation (Navarro et al.,
2009). Similar cis conformations for c-di-GMP have been
observed in other structures, for example in complexes
with riboswitches, the degenerate GGDEF domain protein
PelD, the c-di-GMP effector protein domain PilZ (Habazettl
et al., 2011; Smith et al., 2011; Li et al., 2012), and with the
human STING proteins (Shu et al., 2012), which are
stimulators of interferon genes. The cis conformation together
with the space provided by the nucleotide binding cavity is 
proposed to facilitate the sequential hydrolysis of c-di-GMP 
by HD-GYP PDEs. Based on the structure of the complex 
of PmGH with the final reaction product GMP, which shows 
the 5'-phosphate of GMP bound to the M-H Fe pair, c-di- 
GMP hydrolysis may be catalysed by these two sites alone, 
with the G-site Fe contributing to the hydrolysis rate. This 
would invoke primarily a two-metal-ion mechanism as seen 
for other PDEs (Barends et al., 2009; Tchigvintsev et al.,
2010). In this case, the two scissile c-di-GMP phosphates
would be sequentially hydrolysed by the M-H Fe pair with 
the phosphate group which initially binds to the hydroxyl 
group of Y285 being brought into a similar conformation 
to the phosphate group bound to the M-site Fe(III). After
hydrolysis of this M-site bound phosphate group, the 
hydrolysable phosphate group of 5'-phosphoguanylyl-(3'- 
5')-guanosine (5'-pGpG), the product of this first hydrolysis 
step, can be positioned in a similar conformation to the 
previously hydrolysed phosphate by a rotation around the 
original twofold axis of the intact c-di-GMP. This would then 
enable the conversion of the now bound 5'-pGpG to GMP. 
Equally, it is conceivable that the G-M Fe pair could also 
contribute to hydrolysis but the capture of the complex 
between PmGH and c-di-GMP due to the presence only of 
the G-site Fe suggests this Fe may play more of a structural 
role, although clearly a critical one, as single alanine
mutations of the conserved protein metal ligands of the G-site
Fe render the protein inactive. In closing, a direct role in
catalysis for all three metal sites in PmGH cannot be ruled
out, as the enzyme has adopted a novel tri-iron centre with
the metal ions closely spaced (< 3.64 Å) and single point
mutations of any of the eight protein metal ligands
abolished activity in all cases except for E185 and D305. The
impact of these mutations, which would disrupt binding of
the M-H ion pair, may be artificially mitigated by binding of
phosphate or carboxylate ions from the cell as the ligands 
are positioned on the solvent facing side of the metal 
centre. The mutations do impair the activity of PmGH and 
thus the predicted distribution of bi- and trinuclear centres 
in the HD-GYP family may contribute to modulation of PDE 
activity. Further structural work will be required to decipher 
unequivocally the roles of the three metal Fe sites which 
will be aided by a crystal structure of an active binuclear 
HD-GYP domain together with a global kinetic analysis of 
both the bi- and trinuclear metal-containing HD-GYP
subfamilies identified here. 
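
The spacing criterion quoted above (metal ions less than 3.64 Å apart) can be checked directly from atomic coordinates. The following is a minimal Python sketch of such a check and is not the procedure used in this study. The function name, the choice to test only the G-M and M-H pairs, and the three coordinates are all assumptions made for illustration; the coordinates are invented placeholders and are not taken from the deposited structures.

```python
import math

CUTOFF = 3.64  # Å, upper bound on metal-metal spacing quoted in the text


def spacing_ok(sites, pairs, cutoff=CUTOFF):
    """Return True if every listed pair of metal sites is closer than cutoff.

    sites: dict mapping a site label to an (x, y, z) position in Å.
    pairs: iterable of (label_a, label_b) tuples to test.
    """
    return all(math.dist(sites[a], sites[b]) < cutoff for a, b in pairs)


# Hypothetical coordinates in Å, for illustration only (NOT from 4MCW).
example_sites = {
    "G": (0.0, 0.0, 0.0),
    "M": (3.5, 0.0, 0.0),
    "H": (3.5, 3.6, 0.0),
}

# Assumption: only neighbouring pairs (G-M and M-H) are tested.
neighbour_pairs = [("G", "M"), ("M", "H")]
neighbours_close = spacing_ok(example_sites, neighbour_pairs)
```

With real data, the placeholder dictionary would be replaced by the Fe positions read from a coordinate file, and the list of pairs adjusted to whichever metal-metal distances are of interest.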

Structural insights into the multifunctional roles of 
HD-GYP domains 

Recent studies have described a regulatory role for 
protein-protein interactions involving the HD-GYP domain 
protein RpfG. Interaction of RpfG with specific GGDEF 
domain proteins serves to control motility in X. campestris 
(Ryan et al., 2012). Mutational analysis showed that the
GYP motif is critical for HD-GYP::GGDEF complex formation
but not necessary for the PDE activity of RpfG against
c-di-GMP (Ryan et al., 2010). The structure of the PmGH
HD-GYP reveals that although the GYP containing loop is 
surface exposed and well ordered, the Y285 of the GYP 
motif is placed inside the substrate-binding pocket, where 
it H-bonds to c-di-GMP. Therefore, the structural data
presented in this study suggest that if GGDEF domains
interact directly with Y285, they need to intercalate with the
inner side of the HD-GYP nucleotide-binding pocket. Such
a mechanism would clearly prevent c-di-GMP binding and
PDE activity, although this has not been observed in vitro
(Ryan et al., 2012). An intriguing alternative is that RpfG
acts as a trigger enzyme for protein complex formation and
regulation in a similar fashion to the EAL domain protein
YciR of Escherichia coli (Lindenberg et al., 2013). In this
scenario, RpfG involvement in protein-protein complexes 
would be determined not only by c-di-GMP binding but also 
by conformational alterations associated with c-di-GMP 
degradation, which would be 'reported' via the GYP loop. 

With the work described here we now have structures 
of representatives of all domains involved in c-di-GMP 
metabolism. However, further structural work is required to 
gain a fuller understanding of the relative contributions of 
the amino acids making up the catalytic site of the enzyme 
and to understand how the HD-GYP domain can utilize
both the bi- and trinuclear metal centres that we propose
here to hydrolyse c-di-GMP. Furthermore, the elucidation
of the structures of additional HD-GYP domains from 
the different classes that we have defined here and of 
the HD-GYP domain in complex with the GGDEF domain 
will be necessary to provide a fuller understanding of 
the regulatory action of this diverse family of signalling 
proteins. 

Experimental procedures 

Cloning and protein production 

Initial construct design was aided by bioinformatic tools 
implemented using the OPAL and OPTIC resources as 
described by Albeck et al. (2006) as well as the XTALPRED
and HHPRED servers (Soding et al., 2005; Slabinski et al.,
2007). The gene encoding the GAF/HD-GYP protein from
Persephonella marina EX-H1 (perma_0986; referred to here
as PmGH) was optimized for structural studies through
mutation of its two cysteines to alanines and deletion of the
terminal three amino acids, which were predicted to be
unstructured (Remmert et al., 2012). The construct was
synthesized by Genscript, inserted in pUC57, subcloned into
pET47b using the STRU-cloning protocol (Bellini et al., 2011)
and transformed into E. coli BL21(DE3). BL21(DE3) cells
were grown in LB medium and induced with 0.25 mM IPTG;
protein overexpression was carried out at 37°C for 1 h.
Purification was achieved by Ni2+ affinity chromatography using
the N-terminal His6 tag followed by tag cleavage using
recombinant HRV 3C protease. Protein was concentrated to
10 mg ml⁻¹ for crystallization experiments.



Crystallization, data collection and structure solution 

Diffraction quality crystals were obtained using the hanging 
drop vapour diffusion method from mixing equal volumes of 
protein with a precipitant solution made up of 0.1 M MES 
pH 6.5, 0.9 M succinic acid and 2% PEG 2000. PmGH crys- 
tallized in space group I222 with one dimer in the asymmetric 
unit. Crystals were cryoprotected with ethylene glycol [25% 
(v/v)] prior to flash-cooling in liquid nitrogen. The structure 
was solved using SAD phasing by exploiting the anomalous
scattering from the bound iron atoms that were identified by
an X-ray fluorescence spectrum prior to the diffraction
experiment (Walsh et al., 1999). Native and SAD data at the Fe-Kα
absorption edge were collected on Diamond Light Source
beamlines I02 and I24, respectively. Nucleotide-PmGH
complexes were obtained by soaking PmGH crystals
overnight at 277 K with 25 mM c-di-GMP (BioLog/KeraFast)
and 30 mM GMP, both in the presence and absence of
100 mM EDTA, and data were evaluated and collected on Diamond
beamlines I02, I03, I04 and I04-1. The structure of PmGH
was determined from a single wavelength anomalous diffraction
experiment at the Fe absorption edge determined
experimentally on beamline I24 and data extended against a high
resolution data set (2.03 Å) collected from a different crystal
using beamline I02. All data were processed using XDS
within XIA2 (Kabsch, 2010; Winter, 2010) and the Fe
substructure was determined using SHELX (Sheldrick, 2010).
Building of the structure was aided by automated procedures
in Buccaneer (Cowtan, 2006) and manual building was
performed with COOT (Emsley et al., 2010). The structure was
refined with REFMAC5 (Murshudov et al., 1997) and
validated with MolProbity (Chen et al., 2010). Figures of
structures were prepared with PyMOL (Schrodinger, 2010).

Alteration of residues in the PmGH HD-GYP domain by 
mutagenic PCR 

Site-directed mutagenesis to introduce the alterations in 
residues involved in metal binding (E185A, H189A, H221A, 
D222A, H250A, H276A, H277A and D305A), substrate
binding (Y285A, R314A and K317A) and other residues
of interest (D183A, K225A, G284A, P286A, I294A and
D308A) was done by using mutagenic PCR in a two-step 
protocol as previously described (Ryan et al., 2006; 2010). In 
the first round of PCR, two separate reactions were carried 
out by using the forward and reverse primers together with 
one of a pair of primers of complementary sequence carrying 
the desired alteration and the HD-GYP construct in pET47b 
as template. (Mutagenic primer sequences will be given upon 
request.) The products of the first round of PCR were used as 
templates for a second round of PCR with forward and 
reverse primers. 

Enzymatic assays on the HD-GYP domain and 
its variants 

The assay buffer and reaction conditions were as described 
elsewhere (Ryan et al., 2006). Briefly, a standard reaction
mixture contained 20 µg of protein, 50 mM Tris-HCl (pH 7.6),
10 mM MgCl2, 10 mM MnCl2, 0.5 mM EDTA and 50 mM NaCl
in a total volume of 600 µl. The assay mixture was warmed to
37°C before the reaction was started by the addition of 27 µl
of substrate to give a final concentration of 100 µM. Aliquots
of 200 µl were withdrawn to a sterile Eppendorf tube at the
indicated time points, and the assay was terminated by
placing the tube in a boiling water bath for 3 min. After
centrifugation at 15 000 g for 2 min, the supernatant was filtered
through a 0.22 µm filter before analysis by reverse-phase
HPLC on a Hewlett-Packard Model 1090 Series II HPLC
system. Samples of 50 µl were injected into a SunFire C18-T
column (150 × 4.6 mm; Waters) and fractionated by using 2%
(v/v) acetonitrile/98% Na phosphate buffer (pH 5.8) under
isocratic conditions at a flow rate of 0.7 ml min⁻¹. Nucleotides
were detected at a wavelength of 252 nm. 
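
The substrate addition described above implies a stock concentration that follows from the dilution relation C1 × V1 = C2 × V2. The short Python sketch below works through that arithmetic. It is illustrative only: the function name is hypothetical, and because the text does not state whether the 600 µl total volume already includes the 27 µl of substrate, both readings are computed as labelled assumptions.

```python
def stock_concentration_uM(final_uM, added_ul, final_volume_ul):
    """Stock concentration (µM) such that added_ul of stock, diluted to
    final_volume_ul, gives final_uM. Uses C1 * V1 = C2 * V2."""
    return final_uM * final_volume_ul / added_ul


# Reading 1 (assumption): 600 µl is the final volume, substrate included.
stock_if_included = stock_concentration_uM(100, 27, 600)

# Reading 2 (assumption): the 27 µl is added on top of a 600 µl mixture.
stock_if_added = stock_concentration_uM(100, 27, 600 + 27)

# Amount of substrate in a 600 µl reaction at 100 µM.
# µM × µl gives pmol; dividing by 1000 converts to nmol.
substrate_nmol = 100 * 600 / 1000
```

Under either reading the implied stock is a little over 2 mM, and a 600 µl reaction at 100 µM contains 60 nmol of substrate.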

Accession numbers 

The co-ordinates and structure factors have been deposited 
in the Protein Data Bank (http://www.pdb.org) under PDB ID codes
4MCW (PmGH), 4MDZ (complex with c-di-GMP) and 4ME4 
(complex with GMP). 